Emerging Trends and Technologies in Statistics and Data Science

Dr. Thiyanga S. Talagala
Department of Statistics, Faculty of Applied Sciences
University of Sri Jayewardenepura, Sri Lanka

Must-know technologies and tools

Programming Languages

  • Python
  • R
  • Julia

Python

Python IDEs

  • Jupyter Notebook

  • VS Code

  • PyCharm

  • Google Colab

Python Data Wrangling Libraries

  • pandas

  • numpy

  • polars

  • dask

  • pyjanitor
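
As a minimal sketch of what a wrangling step with pandas and numpy might look like, the example below fills a missing value and summarises by group. The data frame, its column names (`group`, `score`), and the choice of mean imputation are illustrative assumptions, not part of the original slides.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "score": [1.0, np.nan, 3.0, 4.0, 5.0],
})

# Replace the missing score with the overall mean of the observed scores.
clean = df.assign(score=df["score"].fillna(df["score"].mean()))

# Average score per group, returned as a regular two-column data frame.
summary = clean.groupby("group", as_index=False)["score"].mean()
print(summary)
```

The same steps could be written with polars or dask; pandas is used here only because it is the first library on the list.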

Python Data Visualization Libraries

  • matplotlib

  • seaborn

  • plotly

  • altair

  • folium

  • plotnine

Python Statistics Libraries

  • scipy.stats

  • statsmodels

  • sympy
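
To give a flavour of scipy.stats, the sketch below runs Welch's two-sample t-test on simulated data. The sample sizes, means, and random seed are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

# Simulated samples; the seed, sizes, and means are illustrative assumptions.
rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=50)
b = rng.normal(loc=0.5, scale=1.0, size=50)

# Welch's t-test: compares the two means without assuming equal variances.
result = stats.ttest_ind(a, b, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```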

Python Machine Learning Libraries

  • scikit-learn

  • XGBoost / LightGBM

  • TensorFlow / Keras

  • PyTorch

  • CatBoost
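
A typical scikit-learn workflow can be sketched as follows: split the data, chain preprocessing and a model in a pipeline, then score on held-out data. The iris dataset, the logistic regression model, and the 75/25 split are assumptions made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Built-in example dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Standardise the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Fraction of held-out rows classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Putting the scaler inside the pipeline means it is fitted on the training rows only, which avoids leaking information from the test set.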

Python Web Development Frameworks and Model Deployment

  • Django

  • Flask / FastAPI

  • Docker

  • Kubernetes

R

Essential R Libraries

  • tidyverse

  • janitor

  • lubridate

  • data.table

R Data Visualization Packages

  • ggplot2

  • patchwork

  • ggthemes

  • plotly

  • leaflet

  • ggridges

R Machine Learning and Modelling packages

  • caret

  • tidymodels

  • randomForest

  • xgboost

  • lightgbm

  • glmnet

  • rpart

  • keras

R Time Series Analysis & Forecasting Packages

  • tsibble

  • fable

  • prophet

  • timetk

R Text Mining & Natural Language Processing (NLP) Packages

  • tm

  • tidytext

  • quanteda

  • udpipe

  • textrecipes

Web Scraping and APIs in R

  • rvest

  • httr

  • jsonlite

Reproducible Reporting & Dashboards in R

  • Quarto
  • rmarkdown
  • knitr
  • flexdashboard
  • shiny
  • gt

Spatial Data Analysis in R

  • sf

  • sp

  • raster

  • tmap

  • terra

  • gstat

Good to know

Cloud Computing & Big Data Technologies

  • Google Cloud Platform (GCP)

  • Amazon Web Services (AWS)

  • Microsoft Azure

  • Hadoop & Apache Spark

Databases & Data Engineering

  • SQL Databases – PostgreSQL, MySQL, SQLite

  • NoSQL Databases – MongoDB, Cassandra

  • Data Warehouses – Snowflake, BigQuery, Redshift

Pre-trained Classification Models

  • VGG16

  • YOLO (You Only Look Once)

  • Vision Transformer (ViT)

Pretrained Model Libraries

  • TensorFlow Hub

  • Torchvision Models

Emerging Trends

  • Explainable AI (XAI) and Interpretable Machine Learning

  • Federated Learning and Privacy-Preserving Analytics

  • Deep Generative Models (DGMs)

  • Real-Time and Streaming Data Analytics

  • Automated Machine Learning

  • Causal Inference & Uplift Modeling

The Future of Statistics, Data Science & AI

These trends highlight a shift toward trustworthy, scalable, and responsible AI.